infinity norm
the tangent kernel cannot be explained from the point of view of "lazy training": when the last layer is non-linear, the
We thank all reviewers for the insightful and encouraging comments. Theorem 3.2 and results in Appendix G has been proved previously (e.g., [1]). Our Hessian analysis results, including Theorem 3.2 and Theorem 3.1, are new. This can perhaps cause confusion. The paper mostly focuses on squared loss while widely applied NNs use softmax-cross entropy loss.
Analysis of Value Iteration Through Absolute Probability Sequences
Mustafin, Arsenii, Colla, Sebastien, Olshevsky, Alex, Paschalidis, Ioannis Ch.
Value Iteration is a widely used algorithm for solving Markov Decision Processes (MDPs). While previous studies have extensively analyzed its convergence properties, they primarily focus on convergence with respect to the infinity norm. In this work, we use absolute probability sequences to develop a new line of analysis and examine the algorithm's convergence in terms of the $L^2$ norm, offering a new perspective on its behavior and performance.
Fast exact recovery of noisy matrix from few entries: the infinity norm approach
The matrix recovery (completion) problem, a central problem in data science and theoretical computer science, is to recover a matrix $A$ from a relatively small sample of entries. While such a task is impossible in general, it has been shown that one can recover $A$ exactly in polynomial time, with high probability, from a random subset of entries, under three (basic and necessary) assumptions: (1) the rank of $A$ is very small compared to its dimensions (low rank), (2) $A$ has delocalized singular vectors (incoherence), and (3) the sample size is sufficiently large. There are many different algorithms for the task, including convex optimization by Candes, Tao and Recht (2009), alternating projection by Hardt and Wooters (2014) and low rank approximation with gradient descent by Keshavan, Montanari and Oh (2009, 2010). In applications, it is more realistic to assume that data is noisy. In this case, these approaches provide an approximate recovery with small root mean square error. However, it is hard to transform such approximate recovery to an exact one. Recently, results by Abbe et al. (2017) and Bhardwaj et al. (2023) concerning approximation in the infinity norm showed that we can achieve exact recovery even in the noisy case, given that the ground matrix has bounded precision. Beyond the three basic assumptions above, they required either the condition number of $A$ is small (Abbe et al.) or the gap between consecutive singular values is large (Bhardwaj et al.). In this paper, we remove these extra spectral assumptions. As a result, we obtain a simple algorithm for exact recovery in the noisy case, under only three basic assumptions. This is the first such algorithm. To analyse the algorithm, we introduce a contour integration argument which is totally different from all previous methods and may be of independent interest.
Reviews: Certified Adversarial Robustness with Additive Noise
I have just some further comments. The certified bound for L_infty 0.3 for MNIST shown in Figure 2 shows that it is approximately 70% accuracy? Whereas TRADES seems to be closer to 100% and Gowal et al is above 90% - it seems low compared to the numbers I am used to. This might be due to the bound being too loose. I definitely agree that the goal of the adversary is to find an image where the difference is imperceptible to the human eye, however, when the perturbation radius is larger we should be less sure that **all** images within this space are imperceptible to the original.
The Condorcet Dimension of Metric Spaces
Lassota, Alexandra, Vetta, Adrian, von Stengel, Bernhard
An ideal winner in such an election is a Condorcet winner, a candidate that "beats" any other candidate in a pairwise voting contest. Formally, i C is a Condorcet winner if, for every candidate j C\{i}, more than half of the voters prefer i over j. To avoid ties, we assume the number n of voters is odd. Unfortunately, it is easy to construct elections where a Condorcet winner does not exist; indeed typically there is no Condorcet winner. Given this, Elkind, Lang and Saffidine [12] proposed a relaxation where, rather than a single winning candidate, we desire a set of winning candidates that collectively beats any other candidate. Specifically, a set of candidates S C is a Condorcet winning set if, for every candidate j C \S, more than half of the voters prefer some (voter-dependent) candidate in S to j. Elkind et al. [12] showed that a Condorcet winning set always exists provided we allow the winning set to be large enough.